The First Surface Realisation Shared Task: Overview and Evaluation Results
Abstract
The Surface Realisation (SR) Task was a new task at Generation Challenges 2011, and had two tracks: (1) Shallow: mapping from shallow input representations to realisations; and (2) Deep: mapping from deep input representations to realisations. Five teams submitted six systems in total, and we additionally evaluated human toplines. Systems were evaluated automatically using a range of intrinsic metrics. In addition, systems were assessed by human judges in terms of Clarity, Readability and Meaning Similarity. This report presents the evaluation results, along with descriptions of the SR Task Tracks and evaluation methods. For descriptions of the participating systems, see the separate system reports in this volume, immediately following this results report.
Similar Papers
The Surface Realisation Task: Recent Developments and Future Plans
The Surface Realisation Shared Task was first run in 2011. Two common-ground input representations were developed and for the first time several independently developed surface realisers produced realisations from the same shared inputs. However, the input representations had several shortcomings which we have been aiming to address in the time since. This paper reports on our work to date on i...
DCU at Generation Challenges 2011 Surface Realisation Track
In this paper we describe our system and experimental results on the development set of the Surface Realisation Shared Task. DCU submitted 1-best outputs for the Shallow subtask of the shared task, using a surface realisation technique based on dependency-based n-gram models. The surface realiser achieved BLEU and NIST scores of 0.8615 and 13.6841 respectively on the SR development set.
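The BLEU score reported above compares system realisations against reference sentences using overlapping n-grams. As a rough illustration only (a simplified sketch, not the official mteval scoring script used in the shared task), sentence-level BLEU can be computed as the brevity-penalised geometric mean of clipped n-gram precisions:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # All contiguous n-grams of the token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of clipped
    n-gram precisions (n = 1..max_n) times a brevity penalty.
    Zero counts are floored at a tiny epsilon to avoid log(0)."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)
    # Brevity penalty: punish candidates shorter than the reference.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

For example, a realisation identical to its reference scores 1.0, while a realisation sharing no n-grams with the reference scores near zero.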
The TUNA Challenge 2008: Overview and Evaluation Results
The TUNA Challenge was a set of three shared tasks at REG'08, all of which used data from the TUNA Corpus. The three tasks covered attribute selection for referring expressions (TUNA-AS), realisation (TUNA-R) and end-to-end referring expression generation (TUNA-REG). Eight teams submitted a total of 33 systems to the three tasks, with an additional submission to the Open Track. The evaluation used a ...
The First QALB Shared Task on Automatic Text Correction for Arabic
We present a summary of the first shared task on automatic text correction for Arabic text. The shared task received 18 system submissions from nine teams in six countries and represented a diversity of approaches. Our report includes an overview of the QALB corpus, which was the source of the datasets used for training and evaluation, an overview of participating systems, results of the compet...
Generation Challenges 2011 Preface
Generation Challenges 2011 (GenChal'11) was the fifth round of shared-task evaluation competitions (STECs) involving the generation of natural language. It followed four previous events: the Pilot Attribute Selection for Generating Referring Expressions (ASGRE) Challenge in 2007, which had its results meeting at UCNLG+MT in Copenhagen, Denmark; the Referring Expression Generation (REG) Challenges in...